Taxonomy
1) metric-based: learn a good metric over an embedding space, so that a query is classified by comparing it with labeled support examples [1] [2] [3] [4] (see the sketch after this list)
2) optimization-based: learn to adapt quickly via gradient descent
- Meta-Learner LSTM [5]
- MAML [6] [7] [8]
- Reptile (a first-order approximation of MAML) [9]

Optimization-based methods aim to obtain a good parameter initialization. If we simply trained one set of parameters jointly on multiple tasks, the result would likely be suboptimal for each individual task; instead, these methods seek an initialization from which a few gradient steps on new-task data already yield good task-specific parameters.
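To make this concrete, below is a minimal sketch of one Reptile [9] meta-update in PyTorch (the framework choice, the `sample_task` helper, and the hyperparameter values are illustrative assumptions, not the authors' code):

```python
import copy
import torch

def reptile_step(model, sample_task, inner_steps=5, inner_lr=1e-2, meta_lr=0.1):
    """One Reptile meta-update: adapt a copy of the model to one sampled task
    with plain SGD, then move the shared initialization toward the adapted
    weights instead of backpropagating through the inner loop (first-order)."""
    task_loader, loss_fn = sample_task()      # hypothetical task sampler
    adapted = copy.deepcopy(model)            # task-specific copy of the init
    opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)

    for _ in range(inner_steps):              # inner loop: ordinary SGD on the task
        x, y = next(iter(task_loader))
        opt.zero_grad()
        loss_fn(adapted(x), y).backward()
        opt.step()

    # Outer update: theta <- theta + meta_lr * (theta_task - theta)
    with torch.no_grad():
        for p, p_task in zip(model.parameters(), adapted.parameters()):
            p.add_(meta_lr * (p_task - p))
```

Full MAML [6] instead backpropagates through the inner-loop updates (second-order gradients); Reptile skips that cost, which is why it is described as an approximation.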

3) model-based: predict or generate model parameters directly from the support set, e.g., with external memory [10] or fast weights [11]
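As referenced under item 1, here is a minimal sketch of the metric-based idea in the style of Prototypical Networks [3], also in PyTorch (the `encoder` network and the tensor shapes are assumptions for illustration):

```python
import torch

def prototypical_logits(encoder, support_x, support_y, query_x, n_classes):
    """Metric-based few-shot classification: embed support and query examples,
    average the support embeddings per class into prototypes, and score each
    query by negative squared Euclidean distance to every prototype."""
    z_support = encoder(support_x)            # (n_support, d)
    z_query = encoder(query_x)                # (n_query, d)

    # Class prototype = mean embedding of that class's support examples.
    prototypes = torch.stack([
        z_support[support_y == c].mean(dim=0) for c in range(n_classes)
    ])                                        # (n_classes, d)

    # Negative squared distances act as logits; a softmax over them gives
    # class probabilities. Episodic training minimizes cross-entropy on the
    # query set, e.g. torch.nn.functional.cross_entropy(logits, query_y).
    return -torch.cdist(z_query, prototypes).pow(2)
```

Matching Networks [1] and Relation Networks [2] follow the same pattern but replace the fixed Euclidean distance with attention or a learned comparison module.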
References:
- [1] Vinyals, Oriol, et al. “Matching networks for one shot learning.” NIPS, 2016.
- [2] Sung, Flood, et al. “Learning to compare: Relation network for few-shot learning.” CVPR, 2018.
- [3] Snell, Jake, Kevin Swersky, and Richard Zemel. “Prototypical networks for few-shot learning.” NIPS, 2017.
- [4] Ren, Mengye, et al. “Meta-learning for semi-supervised few-shot classification.” arXiv preprint arXiv:1803.00676, 2018.
- [5] Ravi, Sachin, and Hugo Larochelle. “Optimization as a model for few-shot learning.” ICLR, 2017.
- [6] Finn, Chelsea, Pieter Abbeel, and Sergey Levine. “Model-agnostic meta-learning for fast adaptation of deep networks.” ICML, 2017.
- [7] Finn, Chelsea, and Sergey Levine. “Meta-learning and universality: Deep representations and gradient descent can approximate any learning algorithm.” arXiv preprint arXiv:1710.11622, 2017.
- [8] Grant, Erin, et al. “Recasting gradient-based meta-learning as hierarchical Bayes.” arXiv preprint arXiv:1801.08930, 2018.
- [9] Nichol, Alex, Joshua Achiam, and John Schulman. “On first-order meta-learning algorithms.” arXiv preprint arXiv:1803.02999, 2018.
- [10] Santoro, Adam, et al. “Meta-learning with memory-augmented neural networks.” ICML, 2016.
- [11] Munkhdalai, Tsendsuren, and Hong Yu. “Meta networks.” ICML, 2017.